Spatial data

  • Lots of data in our field(s) are inherently spatial.
  • R has lots of tools for interacting with and analyzing spatial data.

the sf package

  • The sf (simple features) package simplifies dealing with spatial data in R.
  • It stores spatial data in a data frame with a special column, called geometry by default.
  • This geometry column holds the shapes and their coordinate reference system, so R can plot real maps (to scale) and run lots of geospatial analyses.

actual spatial data versus coordinates

We often just have coordinate data, but R treats bare coordinates as ordinary numbers, not as locations on the Earth.

myCoords <- data.frame(long = runif(20, min=35, max=36), lat = runif(20, min = 3, max=5))
plot(myCoords)
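
One reason bare coordinates are not "to scale": a degree of longitude covers less ground the further you get from the equator. Here is a rough sketch of that effect, assuming a spherical Earth and a round figure of 111.32 km per degree at the equator. The function name km_per_degree_lon is made up for this example.

```r
# Approximate width of one degree of longitude at a given latitude.
# Assumes a spherical Earth; km_at_equator is a rounded, assumed constant.
km_per_degree_lon <- function(lat_deg, km_at_equator = 111.32) {
  km_at_equator * cos(lat_deg * pi / 180)
}

# Compare the equator, our simulated sites (~4 degrees N), and higher latitudes
km_per_degree_lon(c(0, 4, 39, 60))
```

A plain plot() of long against lat ignores this shrinkage, which is part of what a spatial object takes care of.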

making spatial objects

For a gold star: Which is the x and which is the y when we are dealing with latitude and longitude?

library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
SFmyCoords <- st_as_sf(myCoords, coords = c("long", "lat"))
st_crs(x = SFmyCoords) <- "EPSG:4326" #EPSG code for the most common coordinate reference system (long/lat on the WGS 84 datum)
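
As a possible shortcut, st_as_sf() also accepts a crs argument, so the coordinate reference system can be set when the object is created. This is a minimal sketch with made-up points. The names pts and pts_sf are invented for the example, and the two checks at the end are just one way to confirm that the CRS stuck.

```r
library(sf)

set.seed(42) # only so the made-up points are reproducible
pts <- data.frame(long = runif(5, min = 35, max = 36),
                  lat  = runif(5, min = 3, max = 5))

# Create the spatial object and assign the CRS in one step
pts_sf <- st_as_sf(pts, coords = c("long", "lat"), crs = 4326)

st_crs(pts_sf)$epsg    # EPSG code stored with the geometry column
st_is_longlat(pts_sf)  # TRUE when the coordinates are in degrees
```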

smarter plots

plot(SFmyCoords, axes=TRUE, xlab="longitude (degrees)", ylab="latitude (degrees)", pch=16)

fully spatial object

Now there is a plethora of spatial things you can do with these points that you couldn’t before.

#bounding box
st_bbox(SFmyCoords)
##      xmin      ymin      xmax      ymax 
## 35.009496  3.079992 35.923433  4.984301

fully spatial object

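Add a buffer around the points, and then plot

#with s2 switched on (see the sf startup message), dist for long/lat data is in metres, not degrees
buffered <- st_buffer(SFmyCoords, dist = 10000)
plot(buffered, axes=TRUE, col="red", main="sites with 10 km buffer")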
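
Buffer distances are easy to get wrong because their units depend on the coordinate reference system. One way to make the units explicit is to project the points to a metric CRS before buffering. The sketch below assumes UTM zone 36N (EPSG:32636), which should suit sites between 30 and 36 degrees east. The points and object names are made up for the example.

```r
library(sf)

set.seed(1) # only so the made-up points are reproducible
sites <- st_as_sf(
  data.frame(long = runif(20, min = 35, max = 36),
             lat  = runif(20, min = 3, max = 5)),
  coords = c("long", "lat"), crs = 4326
)

# Project to UTM zone 36N (assumed suitable here), whose units are metres
sites_utm <- st_transform(sites, 32636)

# Buffer by 10 km; dist is now unambiguously in metres
buffered_utm <- st_buffer(sites_utm, dist = 10000)

# Each buffer should cover roughly pi * 10^2 square km
summary(as.numeric(st_area(buffered_utm)) / 1e6)
```

If the buffers need to go back on a long/lat map, st_transform(buffered_utm, 4326) converts them again.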

fully spatial object

Check out the sf cheatsheet to see the kinds of spatial operations you can do.

why does it matter to treat spatial data correctly?

DC_and_Nairobi <- data.frame(lat=c(-1.2921,38.9072), long=c(36.8219,-77.0369), name=c("Nairobi", "DC"))
DC_and_Nairobi
##       lat     long    name
## 1 -1.2921  36.8219 Nairobi
## 2 38.9072 -77.0369      DC
citiesSpatial <- st_as_sf(DC_and_Nairobi, coords=c("long", "lat"))
st_crs(x = citiesSpatial) <- "EPSG:4326" 
#EPSG code for the most common coordinate reference system (long/lat on the WGS 84 datum)

why does it matter to treat spatial data correctly?

dist(DC_and_Nairobi[, c("long", "lat")]) #keep only the numeric columns; the result is in "degrees", not a real distance
##          1
## 2 120.7469
st_distance(citiesSpatial)
## Units: [m]
##          [,1]     [,2]
## [1,]        0 12142308
## [2,] 12142308        0
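
The two answers differ because dist() treats the coordinates as points on a flat sheet measured in degrees, while st_distance() follows the curved surface of the Earth and reports metres. To see roughly what the spatial version is doing, here is a sketch of the haversine (great-circle) formula in base R. It assumes a perfectly spherical Earth with a mean radius of 6371 km, so it will not match the ellipsoidal sf result exactly. The function name haversine_km is made up for the example.

```r
# Great-circle distance on a sphere (haversine formula).
# Assumes a spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(a))
}

# Nairobi to DC, using the coordinates from the slide above
nairobi_dc_km <- haversine_km(36.8219, -1.2921, -77.0369, 38.9072)
nairobi_dc_km
```

The result should land close to the roughly 12,000 km that st_distance() reports, and nowhere near the number from dist().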

rnaturalearth

The rnaturalearth package provides maps that you can use in your plots.

It can be a bit hard to install this package.

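#try this first (rnaturalearth and rnaturalearthdata are on CRAN)
install.packages(c("rnaturalearth", "rnaturalearthdata"))
#rnaturalearthhires is not on CRAN; install it from the rOpenSci repository
install.packages("rnaturalearthhires", repos = "https://ropensci.r-universe.dev")

#if the above fails, try installing from GitHub
install.packages("devtools")
devtools::install_github("ropensci/rnaturalearth")
devtools::install_github("ropensci/rnaturalearthdata")
devtools::install_github("ropensci/rnaturalearthhires")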

rnaturalearth

library(rnaturalearth)
world <- ne_countries(scale = "medium", returnclass = "sf")
plot(world$geometry)

These objects can have lots of attributes

Because these are sf objects, each feature comes with attributes stored as ordinary data frame columns, which come in handy later for plotting.

head(world)
## Simple feature collection with 6 features and 63 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -70.06611 ymin: -18.01973 xmax: 74.89131 ymax: 60.40581
## Geodetic CRS:  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
##   scalerank      featurecla labelrank     sovereignt sov_a3 adm0_dif level
## 0         3 Admin-0 country         5    Netherlands    NL1        1     2
## 1         1 Admin-0 country         3    Afghanistan    AFG        0     2
## 2         1 Admin-0 country         3         Angola    AGO        0     2
## 3         1 Admin-0 country         6 United Kingdom    GB1        1     2
## 4         1 Admin-0 country         6        Albania    ALB        0     2
## 5         3 Admin-0 country         6        Finland    FI1        1     2
##                type       admin adm0_a3 geou_dif     geounit gu_a3 su_dif
## 0           Country       Aruba     ABW        0       Aruba   ABW      0
## 1 Sovereign country Afghanistan     AFG        0 Afghanistan   AFG      0
## 2 Sovereign country      Angola     AGO        0      Angola   AGO      0
## 3        Dependency    Anguilla     AIA        0    Anguilla   AIA      0
## 4 Sovereign country     Albania     ALB        0     Albania   ALB      0
## 5           Country       Aland     ALD        0       Aland   ALD      0
##       subunit su_a3 brk_diff        name     name_long brk_a3    brk_name
## 0       Aruba   ABW        0       Aruba         Aruba    ABW       Aruba
## 1 Afghanistan   AFG        0 Afghanistan   Afghanistan    AFG Afghanistan
## 2      Angola   AGO        0      Angola        Angola    AGO      Angola
## 3    Anguilla   AIA        0    Anguilla      Anguilla    AIA    Anguilla
## 4     Albania   ALB        0     Albania       Albania    ALB     Albania
## 5       Aland   ALD        0       Aland Aland Islands    ALD       Aland
##   brk_group abbrev postal                    formal_en formal_fr note_adm0
## 0      <NA>  Aruba     AW                        Aruba      <NA>     Neth.
## 1      <NA>   Afg.     AF Islamic State of Afghanistan      <NA>      <NA>
## 2      <NA>   Ang.     AO  People's Republic of Angola      <NA>      <NA>
## 3      <NA>   Ang.     AI                         <NA>      <NA>      U.K.
## 4      <NA>   Alb.     AL          Republic of Albania      <NA>      <NA>
## 5      <NA>  Aland     AI                Åland Islands      <NA>      Fin.
##   note_brk   name_sort name_alt mapcolor7 mapcolor8 mapcolor9 mapcolor13
## 0     <NA>       Aruba     <NA>         4         2         2          9
## 1     <NA> Afghanistan     <NA>         5         6         8          7
## 2     <NA>      Angola     <NA>         3         2         6          1
## 3     <NA>    Anguilla     <NA>         6         6         6          3
## 4     <NA>     Albania     <NA>         1         4         1          6
## 5     <NA>       Aland     <NA>         4         1         4          6
##    pop_est gdp_md_est pop_year lastcensus gdp_year                    economy
## 0   103065     2258.0       NA       2010       NA       6. Developing region
## 1 28400000    22270.0       NA       1979       NA  7. Least developed region
## 2 12799293   110300.0       NA       1970       NA  7. Least developed region
## 3    14436      108.9       NA         NA       NA       6. Developing region
## 4  3639453    21810.0       NA       2001       NA       6. Developing region
## 5    27153     1563.0       NA         NA       NA 2. Developed region: nonG7
##                income_grp wikipedia fips_10 iso_a2 iso_a3 iso_n3 un_a3 wb_a2
## 0 2. High income: nonOECD        NA    <NA>     AW    ABW    533   533    AW
## 1           5. Low income        NA    <NA>     AF    AFG    004   004    AF
## 2  3. Upper middle income        NA    <NA>     AO    AGO    024   024    AO
## 3  3. Upper middle income        NA    <NA>     AI    AIA    660   660  <NA>
## 4  4. Lower middle income        NA    <NA>     AL    ALB    008   008    AL
## 5    1. High income: OECD        NA    <NA>     AX    ALA    248   248  <NA>
##   wb_a3 woe_id adm0_a3_is adm0_a3_us adm0_a3_un adm0_a3_wb     continent
## 0   ABW     NA        ABW        ABW         NA         NA North America
## 1   AFG     NA        AFG        AFG         NA         NA          Asia
## 2   AGO     NA        AGO        AGO         NA         NA        Africa
## 3  <NA>     NA        AIA        AIA         NA         NA North America
## 4   ALB     NA        ALB        ALB         NA         NA        Europe
## 5  <NA>     NA        ALA        ALD         NA         NA        Europe
##   region_un       subregion                 region_wb name_len long_len
## 0  Americas       Caribbean Latin America & Caribbean        5        5
## 1      Asia   Southern Asia                South Asia       11       11
## 2    Africa   Middle Africa        Sub-Saharan Africa        6        6
## 3  Americas       Caribbean Latin America & Caribbean        8        8
## 4    Europe Southern Europe     Europe & Central Asia        7        7
## 5    Europe Northern Europe     Europe & Central Asia        5       13
##   abbrev_len tiny homepart                       geometry
## 0          5    4       NA MULTIPOLYGON (((-69.89912 1...
## 1          4   NA        1 MULTIPOLYGON (((74.89131 37...
## 2          4   NA        1 MULTIPOLYGON (((14.19082 -5...
## 3          4   NA       NA MULTIPOLYGON (((-63.00122 1...
## 4          4   NA        1 MULTIPOLYGON (((20.06396 42...
## 5          5    5       NA MULTIPOLYGON (((20.61133 60...
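
Since an sf object is still a data frame, the usual subsetting works, and the geometry column tags along unless you drop it on purpose. This is a minimal sketch with a tiny hand-made object; the names and population values are hypothetical and only stand in for the columns of world.

```r
library(sf)

# A made-up two-row sf object (values are illustrative only)
toy <- st_sf(
  name = c("A", "B"),
  pop_est = c(100, 2500),
  geometry = st_sfc(st_point(c(36.8, -1.3)), st_point(c(38.7, 9.0)), crs = 4326)
)

# Filter rows and pick one column; the geometry column stays attached
big <- toy[toy$pop_est > 1000, "name"]
names(big)

# Drop the geometry to get back a plain data frame
plain <- st_drop_geometry(toy)
class(plain)
```

The same pattern should apply to world, for example world[world$continent == "Africa", c("name", "pop_est")].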

works with ggplot2 using geom_sf

library(ggplot2)
ggplot(data=world) + 
  geom_sf(aes(fill=as.numeric(pop_est))) + 
  scale_fill_viridis_c() + 
  coord_sf(expand=FALSE) #don't pad the axes beyond the extent of the data

labeling

countries <- ne_countries(scale="medium",
  country=c("Kenya", "United Republic of Tanzania", "Ethiopia", "Uganda", "Rwanda", "Burundi"),
  returnclass = "sf")
ggplot(countries) +
  geom_sf(aes(fill=sovereignt)) +
  geom_sf_label(aes(label=sovereignt), size=2) +
  guides(fill="none")

more detailed maps from rnaturalearth

ethiopia <- ne_states(country="Ethiopia", returnclass = "sf")
ggplot(ethiopia) + 
  geom_sf(aes(fill=name))

Challenge

Use rnaturalearth::ne_countries() to download map data for Brazil. Plot this and make it pretty.

RgoogleMaps

library("RgoogleMaps")
## 
## Thank you for using RgoogleMaps!
## 
## To acknowledge our work, please cite the package:
##  Markus Loecher and Karl Ropkins (2015). RgoogleMaps and loa: Unleashing R
##   Graphics Power on Map Tiles. Journal of Statistical Software 63(4), 1-18.
GW <- GetMap(center = c(38.899, -77.049), zoom = 17)
PlotOnStaticMap(GW)

RgoogleMaps

GWSatellite <- GetMap(center = c(38.899, -77.049), zoom = 17, maptype = "satellite")
PlotOnStaticMap(GWSatellite, lon=-77.0493, lat=38.899, pch=16, col="red", cex=2)

GIS in R

You can do almost anything in R that you can do in ArcGIS.

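This is accomplished through additional packages that wrap the Geospatial Data Abstraction Library (GDAL), a great piece of open source software for reading, writing, and transforming spatial data. The sf package (vector data) and the terra package (raster data) both link to GDAL; the older rgdal package did the same job but was retired from CRAN in 2023. QGIS also relies on GDAL for much of its data handling.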

The spatstat package also implements a variety of spatial statistics.

One issue: R can be very slow with large raster data (e.g., images or satellite data), so if you work heavily with rasters you may need to use other software.

Ask me 3 questions about spatial data in R!